Cost-Benefit Arbitration Between Multiple Reinforcement-Learning Systems.
Authors
Abstract
Human behavior is sometimes determined by habit and other times by goal-directed planning. Modern reinforcement-learning theories formalize this distinction as a competition between a computationally cheap but inaccurate model-free system that gives rise to habits and a computationally expensive but accurate model-based system that implements planning. It is unclear, however, how people choose to allocate control between these systems. Here, we propose that arbitration occurs by comparing each system's task-specific costs and benefits. To investigate this proposal, we conducted two experiments showing that people increase model-based control when it achieves greater accuracy than model-free control, and especially when the rewards of accurate performance are amplified. In contrast, they are insensitive to reward amplification when model-based and model-free control yield equivalent accuracy. This suggests that humans adaptively balance habitual and planned action through on-line cost-benefit analysis.
Similar resources
A Modular Reinforcement Learning Framework for Interactive Narrative Planning
A key functionality provided by interactive narrative systems is narrative adaptation: tailoring story experiences in response to users’ actions and needs. We present a data-driven framework for dynamically tailoring events in interactive narratives using modular reinforcement learning. The framework involves decomposing an interactive narrative into multiple concurrent sub-problems, formalized ...
Switch Packet Arbitration via Queue-Learning
In packet switches, packets queue at switch inputs and contend for outputs. The contention arbitration policy directly affects switch performance. The best policy depends on the current state of the switch and current traffic patterns. This problem is hard because the state space, possible transitions, and set of actions all grow exponentially with the size of the switch. We present a reinforce...
Formalizing Assistive Teleoperation
In assistive teleoperation, the robot helps the user accomplish the desired task, making teleoperation easier and more seamless. Rather than simply executing the user’s input, which is hindered by the inadequacies of the interface, the robot attempts to predict the user’s intent, and assists in accomplishing it. In this work, we are interested in the scientific underpinnings of assistance: we f...
Cooperation in Stochastic Games
The aim of this study is to explore the phenomenon of cooperative learning in the multi-agent stochastic game of Keepout. We intend to investigate whether, for a given number of reinforcement learning agents, cooperative agents can outperform independent agents who do not communicate during learning. In that regard, we would like to work towards quantifying the benefits of cooperation in differen...
A Novel Model for Arbitration between Planning and Habitual Control Systems
It is well established that human decision making and instrumental control use multiple systems, some of which use habitual action selection and some of which require deliberate planning. Deliberate planning systems predict action outcomes using an internal model of the agent’s environment, while habitual action selection systems learn to automate by repeating previously rewarded actions...
Journal title:
- Psychological Science
Volume 28, Issue 9
Pages -
Publication date 2017